# Temporal Localization

VideoMind 2B
BSD-3-Clause
VideoMind is a multimodal agent framework that enhances video reasoning capabilities by simulating human thought processes (such as task decomposition, moment localization & verification, and answer synthesis).
Video-to-Text
yeliudev
207
1
CogVLM2 Video Llama3 Chat
Other
CogVLM2-Video is a video understanding model that achieves state-of-the-art performance on multiple video question-answering tasks and supports understanding of videos up to one minute long.
Text-to-Video Transformers English
THUDM
2,384
48
© 2025 AIbase